Multi-rank Sparse Hierarchical Clustering

Authors

  • Hongyang Zhang
  • Ruben H. Zamar
Abstract

There has been a surge in the number of large and flat data sets (data sets containing a large number of features and a relatively small number of observations) due to the growing ability to collect and store information in medical research and other fields. Hierarchical clustering is a widely used clustering tool. In hierarchical clustering, large and flat data sets may allow for better coverage of clustering features (features that help explain the true underlying clusters), but such data sets usually include a large fraction of noise features (non-clustering features) that may hide the underlying clusters. Witten and Tibshirani (2010) proposed a sparse hierarchical clustering framework that clusters the observations using an adaptively chosen subset of the features; however, we show that this framework has some limitations when the data sets contain clustering features with complex structure. In this paper, another sparse hierarchical clustering (SHC) framework is proposed. Using simulation studies and real data examples, we show that the proposed framework produces superior feature selection and clustering performance compared with classical (off-the-shelf) hierarchical clustering and the existing sparse hierarchical clustering framework.
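The masking effect of noise features described in the abstract can be illustrated with a small simulation. The sketch below is not the framework proposed in the paper, nor that of Witten and Tibshirani (2010); it only runs classical average-linkage hierarchical clustering with SciPy on a synthetic "large and flat" data set, once on the clustering features alone and once on all features. The sample sizes, the mean shift of 4.0, the number of noise features, and the helper name `two_cluster_agreement` are all assumptions chosen for illustration.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster


def two_cluster_agreement(X, truth):
    """Cut an average-linkage dendrogram into 2 clusters and
    return the fraction of observations that agree with `truth`."""
    Z = linkage(X, method="average", metric="euclidean")
    labels = fcluster(Z, t=2, criterion="maxclust")  # labels are 1 or 2
    match = np.mean((labels == 1) == (truth == 0))
    # Cluster label names are arbitrary, so score the better of the two matchings.
    return max(match, 1.0 - match)


rng = np.random.default_rng(0)

# Assumed toy setting: 60 observations, 5 clustering features, 2000 noise features.
n_per_cluster, n_signal, n_noise = 30, 5, 2000
truth = np.repeat([0, 1], n_per_cluster)

# Clustering features: the second cluster is shifted by 4.0 in every coordinate.
signal = rng.normal(size=(2 * n_per_cluster, n_signal))
signal[truth == 1] += 4.0

# Noise features: identical distribution in both clusters.
noise = rng.normal(size=(2 * n_per_cluster, n_noise))
X_all = np.hstack([signal, noise])

acc_signal = two_cluster_agreement(signal, truth)
acc_all = two_cluster_agreement(X_all, truth)

print(f"agreement using clustering features only: {acc_signal:.2f}")
print(f"agreement using all features:             {acc_all:.2f}")
```

In this setting the clusters are well separated in the five clustering features, while the 2000 noise features contribute most of the variation in the Euclidean distances, so the second run tends to recover the true clusters less reliably. That gap is what sparse methods aim to close by selecting or weighting features.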


Similar resources

Tensor Sparse and Low-Rank based Submodule Clustering Method for Multi-way Data

A new submodule clustering method via sparse and low-rank representation for multi-way data is proposed in this paper. Instead of reshaping multi-way data into vectors, this method maintains their natural orders to preserve data intrinsic structures, e.g., image data kept as matrices. To implement clustering, the multi-way data, viewed as tensors, are represented by the proposed tensor sparse an...


Low-rank Multi-view Clustering in Third-Order Tensor Space

The abundant information in multi-view data, together with the complementary information among different views, is usually beneficial to various tasks, e.g., clustering, classification, and de-noising. Multi-view subspace clustering is based on the fact that the multi-view data are generated from a latent subspace. To recover the underlying subspace structure, the success of the sparse and/or low-r...


Robust Multi-View Spectral Clustering via Low-Rank and Sparse Decomposition

Multi-view clustering, which seeks a partition of the data in multiple views that often provide complementary information to each other, has received considerable attention in recent years. In real-life clustering problems, the data in each view may have considerable noise. However, existing clustering methods blindly combine the information from multi-view data with possibly considerable noise...


Multi-way clustering of microarray data using probabilistic sparse matrix factorization

MOTIVATION: We address the problem of multi-way clustering of microarray data using a generative model. Our algorithm, probabilistic sparse matrix factorization (PSMF), is a probabilistic extension of a previous hard-decision algorithm for this problem. PSMF allows for varying levels of sensor noise in the data, uncertainty in the hidden prototypes used to explain the data and uncertainty as to ...


Multi-view low-rank sparse subspace clustering

Most existing approaches address the multi-view subspace clustering problem by constructing the affinity matrix on each view separately and afterwards propose how to extend the spectral clustering algorithm to handle multi-view data. This paper presents an approach to multi-view subspace clustering that learns a joint subspace representation by constructing an affinity matrix shared among all views. Relyi...




Publication date: 2014